Within
the Exchange organization, it is important to deploy multiple transport
servers to provide message path redundancy. Deploying multiple Hub
Transports in each Active Directory site automatically provides
redundancy and load balancing for message delivery. Deploying multiple
Edge Transport servers will also provide incoming and outgoing SMTP
redundancy.
1. Shadow Redundancy
Exchange Server 2010 includes the shadow
redundancy feature, which provides redundancy for messages for the
entire time they are in transit. This is in addition to the transport
dumpster. With one form of shadow
redundancy, the message deletion from the transport queue is delayed
until the transport server verifies that all of the next hops for that
message have completed delivery. If any of the next hops fail before
reporting successful delivery, the transport server resubmits the
message for delivery to that next hop. If the next hop server does not
support shadow redundancy, the message will be sent to the next hop and a shadow copy of the message will not be retained.
Shadow redundancy provides the following benefits:
It eliminates the reliance on the state of the transport server queues.
If redundant message paths exist, the state of any transport server
isn't relevant. If a transport server fails, you can simply remove it
from production without worrying about emptying its queues or losing
messages currently in transit.
If maintenance needs to be performed on a transport server, the server
can be brought offline without the risk of losing messages in transit.
It reduces the need for hardware redundancy for transport servers for messages in transit.
It consumes less bandwidth than other forms of redundancy that create duplicate copies of messages on multiple servers. With shadow redundancy, the only added network traffic is the discard status communicated between transport servers.
It provides resilience and simplifies recovery from a transport server
failure because messages still in transit within the Exchange
organization are protected by the shadow copy held on the previous-hop
Exchange 2010 transport server.
Note:
Shadow redundancy does not protect messages in the transport dumpster.
The transport dumpster remains essential for recovering messages after
a DAG member failure.
One form of shadow redundancy is implemented by extending the SMTP protocol. These service extensions allow SMTP hosts to negotiate shadow redundancy support and communicate the discard status for shadowed messages.
The protocol implementation of shadow
redundancy works between Exchange 2010 transport servers. In the
following scenario, a message is sent from an Exchange 2010 mailbox out
to the Internet from a Hub Transport through an Edge Transport server,
as shown in Figure 1. In this case the message flow follows these stages:
Hub delivers the message to Edge1:
Hub opens an SMTP session with Edge1.
Edge1 advertises shadow redundancy support.
Hub notifies Edge1 to track discard status.
Hub submits the message to Edge1.
Edge1 acknowledges receipt of the message and registers Hub to receive discard information for the message.
Hub moves the message to the shadow queue for Edge1 and marks Edge1 as the primary server. Hub becomes the shadow server.
Edge1 delivers the message to the next hop:
Edge1 submits message to a third-party e-mail server.
The third-party e-mail server acknowledges the message's receipt.
Edge1 updates the discard status for the message to delivery complete.
If the message is delivered successfully, Hub queries Edge1 for discard status:
At the end of each SMTP session with Edge1, Hub queries Edge1 for the
discard status of messages it previously sent. If Hub has not sent any
other messages to Edge1, it opens an SMTP session with Edge1 after five
minutes to query for the discard status, and it fails over after three
failed attempts (15 minutes in total). The interval can be configured using Set-TransportConfig with the ShadowHeartbeatTimeoutInterval parameter, and the number of retries with the ShadowHeartbeatRetryCount parameter.
Edge1 checks the local discard status, sends back the list of messages
registered to Hub that have been delivered, and then removes the discard
information.
Hub deletes the delivered messages from its shadow queue.
If the message delivery fails, then Hub queries Edge1 for discard status and resubmits the message:
If
Hub cannot contact Edge1, Hub resumes the primary role and resubmits
the messages in the shadow queue to another available transport server,
Edge2.
The resubmitted messages are delivered to Edge2, and the workflow starts again from the first stage.
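The stages above can be sketched as a small in-memory simulation. This is an illustrative Python model, not Exchange code: the class names (PrimaryServer, ShadowServer), their methods, and the message IDs are all hypothetical, and the retry limit of three simply mirrors the default described above.

```python
from dataclasses import dataclass, field


@dataclass
class PrimaryServer:
    """Hypothetical model of a next hop such as Edge1."""
    name: str
    online: bool = True
    pending: set = field(default_factory=set)   # accepted, not yet delivered
    discard: set = field(default_factory=set)   # delivered, awaiting a query

    def accept(self, msg_id):
        self.pending.add(msg_id)

    def deliver_all(self):
        # Delivery to the next hop succeeded: record the discard status.
        self.discard |= self.pending
        self.pending.clear()

    def query_discard(self):
        # Return delivered message IDs, then drop the discard information.
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        delivered, self.discard = self.discard, set()
        return delivered


@dataclass
class ShadowServer:
    """Hypothetical model of Hub holding shadow copies."""
    retry_count: int = 3                               # assumed default
    shadow_queue: dict = field(default_factory=dict)   # msg_id -> primary
    failures: dict = field(default_factory=dict)       # primary name -> count

    def send(self, msg_id, primary):
        # Submit the message and keep a shadow copy tagged with its primary.
        primary.accept(msg_id)
        self.shadow_queue[msg_id] = primary

    def heartbeat(self, primary, fallback):
        """Query discard status; resubmit to fallback after repeated failures."""
        try:
            delivered = primary.query_discard()
        except ConnectionError:
            count = self.failures.get(primary.name, 0) + 1
            self.failures[primary.name] = count
            if count >= self.retry_count:
                # Resume the primary role and resubmit the shadowed messages.
                stranded = [m for m, p in self.shadow_queue.items() if p is primary]
                for msg_id in stranded:
                    self.send(msg_id, fallback)
                self.failures[primary.name] = 0
            return
        self.failures[primary.name] = 0
        for msg_id in delivered:
            self.shadow_queue.pop(msg_id, None)
```

In this model, a message sent to Edge1 stays in Hub's shadow queue until a heartbeat returns its discard status. If Edge1 stays unreachable for three consecutive heartbeats, the message is resubmitted to Edge2, and the same cycle begins with Edge2 as the primary.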
The Shadow Redundancy Manager (SRM) is the core component of a transport server responsible for managing shadow redundancy. For messages on which the server is the primary, the SRM tracks the shadow server for each message. For the shadow messages in its shadow queues, the SRM is also responsible for the following tasks:
Determining when the shadow server should take ownership of shadow messages, thus making it the primary server
Maintaining the list and checking primary server availability for each shadow message
Processing discard notifications from primary servers
Removing the shadow messages from the database after receiving the discard notification
Sending the discard status to the shadow servers
Shadow redundancy does not require any configuration. When multiple transport servers are deployed, they automatically negotiate the use of shadow redundancy. When multiple Hub Transport servers are deployed in each Active Directory site, each e-mail
message exists in two places while in transit. Because each message
exists in two locations, you may consider deploying Hub Transport
servers without RAID-protected disks, because in-transit e-mail
messages exist on another server and do not need to be recovered.
However, it is not always advantageous to deploy transport servers
without redundant storage for the message queue, because shadow redundancy does not protect e-mail messages in the transport dumpster.
In configurations with a multi-site DAG, as well as others that
consistently maintain a number of e-mail messages in the transport
dumpster because of transaction log replication latency, you should
store the message queue on redundant storage to reduce the probability
of losing transport dumpster data. You can determine the number of items in the transport dumpster by viewing the Dumpster Item Count counter on the MSExchangeTransport Dumpster performance object using Performance Monitor, or by trending this counter using a solution like Microsoft System Center Operations Manager.
To reduce the likelihood of a
server failure causing a loss of e-mail, the Mailbox Submission service
on a DAG member first attempts to load-balance submission requests
across other Hub Transport servers in the same Active Directory site. If
the Hub Transport role is installed on the DAG member and it cannot
submit messages to any other Hub Transport server in the site, it will
fall back to the local Hub Transport server.
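The selection order described above can be illustrated with a short Python sketch. The function name, the server names, and the is_available callback are hypothetical, and random choice stands in for whatever load-balancing method the Mailbox Submission service actually uses, which the text does not specify.

```python
import random


def pick_submission_target(local_server, site_hubs, is_available, rng=random):
    """Prefer another available Hub Transport in the site; else fall back locally.

    local_server: name of the DAG member submitting the message.
    site_hubs: names of all Hub Transport servers in the same site.
    is_available: callable returning True if a server accepts submissions.
    """
    others = [h for h in site_hubs if h != local_server and is_available(h)]
    if others:
        # Assumption: a random pick approximates load balancing.
        return rng.choice(others)
    if local_server in site_hubs and is_available(local_server):
        # No other Hub Transport is reachable: use the local role.
        return local_server
    return None
```

Preferring a remote Hub Transport server means the message and its source mailbox database do not sit on the same machine, so a single server failure is less likely to lose both.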
1.1. Inbound E-mail Redundancy
Another form of shadow redundancy, called delayed acknowledgement, is used when a transport server receives a message from a mail server that doesn't support shadow
redundancy. Rather than immediately confirming receipt of the message
to the submitting server, the transport server delays sending an
acknowledgement until it has confirmed that the message has been
successfully delivered to its next hop.
For inbound e-mail delivery
with Edge or Hub Transport servers, the typical way to provide
redundancy is to use an MX record for each of the e-mail servers
accessible for e-mail delivery. MX records are weighted records in DNS that point to the e-mail servers responsible for receiving mail for a domain. Sending
servers try the MX records with a lower weighting (preference value)
before higher-weighted records, and they distribute load across records
that have the same weight. Using
MX records to provide this redundancy is part of the way SMTP was
designed, so this configuration is often sufficient. In some instances
where large numbers of SMTP servers are deployed, you may choose to use
network load balancing to have more control over the inbound SMTP
traffic. However, load balancing should never be used inside the
Exchange organization or against the Default Receive Connector on each
Hub Transport server, because load balancing and redundancy are built
into the transport service.
Note:
More information about MX records and how they are used can be found in RFC 2821.
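The MX selection behavior described in this section can be sketched in Python. This is a minimal illustration of the ordering rule only, not a DNS client: the function name and the example.com host names are hypothetical, and the random shuffle stands in for how a sending server spreads load across records of equal weight.

```python
import random
from itertools import groupby


def order_mx_hosts(records, rng=random):
    """Return hosts in the order a sending server would try them.

    records: iterable of (preference, hostname) tuples.
    """
    ordered = []
    by_pref = sorted(records, key=lambda r: r[0])  # lowest preference first
    for _, group in groupby(by_pref, key=lambda r: r[0]):
        hosts = [host for _, host in group]
        rng.shuffle(hosts)  # equal preference: spread load randomly
        ordered.extend(hosts)
    return ordered


# Two Edge Transport servers share the load; a third acts as backup.
records = [
    (20, "edge3.example.com"),
    (10, "edge1.example.com"),
    (10, "edge2.example.com"),
]
attempt_order = order_mx_hosts(records)
```

With these records, edge1 and edge2 are tried first in random order, and edge3 is tried only if both are unreachable.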